99 research outputs found
Super-Fast 3-Ruling Sets
A $t$-ruling set of a graph $G = (V, E)$ is a vertex-subset $S \subseteq V$
that is independent and satisfies the property that every vertex $v \in V$ is
at a distance of at most $t$ from some vertex in $S$. A \textit{maximal
independent set (MIS)} is a 1-ruling set. The problem of computing an MIS on a
network is a fundamental problem in distributed algorithms and the fastest
algorithm for this problem is the $O(\log n)$-round algorithm due to Luby
(SICOMP 1986) and Alon et al. (J. Algorithms 1986) from more than 25 years ago.
Since then the problem has resisted all efforts to yield to a sub-logarithmic
algorithm. There has been recent progress on this problem, most importantly an
$O(\log \Delta \cdot \sqrt{\log n})$-round algorithm on graphs with $n$
vertices and maximum degree $\Delta$, due to Barenboim et al. (Barenboim,
Elkin, Pettie, and Schneider, April 2012, arxiv 1202.1983; to appear FOCS
2012).
We approach the MIS problem from a different angle and ask whether O(1)-ruling
sets can be computed much more efficiently than an MIS. As an answer to this
question, we show how to compute a 2-ruling set of an $n$-vertex graph in
a sub-logarithmic number of rounds. We also show that the above result can be improved
for special classes of graphs such as graphs with high girth, trees, and graphs
of bounded arboricity.
Our main technique involves randomized sparsification that rapidly reduces
the graph degree while ensuring that every deleted vertex is close to some
vertex that remains. This technique may have further applications in other
contexts, e.g., in designing sub-logarithmic distributed approximation
algorithms. Our results raise intriguing questions about how quickly an MIS (or
1-ruling sets) can be computed, given that 2-ruling sets can be computed in
sub-logarithmic rounds.
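
To make the definition of a $t$-ruling set concrete, here is a minimal Python sketch of a checker. It is not taken from the paper: the function name `is_ruling_set` and the adjacency-dictionary input format are assumptions made for illustration. It tests independence directly and then runs a multi-source breadth-first search from the candidate set, truncated at depth t.

```python
from collections import deque

def is_ruling_set(adj, S, t):
    """Check whether S is a t-ruling set of the graph adj (dict: vertex -> neighbours).

    Hypothetical helper for illustration; not the paper's distributed algorithm.
    """
    S = set(S)
    # Independence: no edge may join two members of S.
    for u in S:
        if any(v in S for v in adj[u]):
            return False
    # Coverage: multi-source BFS from S, never expanding beyond distance t.
    dist = {u: 0 for u in S}
    queue = deque(S)
    while queue:
        u = queue.popleft()
        if dist[u] == t:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Every vertex must have been reached within distance t.
    return len(dist) == len(adj)

# A path on five vertices: 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On this path, `{0, 2, 4}` is an MIS and therefore a 1-ruling set, while the single vertex `{2}` is a 2-ruling set but not a 1-ruling set. This shows how relaxing t allows much sparser sets.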
Super-Fast Distributed Algorithms for Metric Facility Location
This paper presents a distributed O(1)-approximation algorithm, with
expected-$O(\log \log n)$ running time, in the $\mathcal{CONGEST}$ model for
the metric facility location problem on a size-$n$ clique network. Though
metric facility location has been considered by a number of researchers in
low-diameter settings, this is the first sub-logarithmic-round algorithm for
the problem that yields an O(1)-approximation in the setting of non-uniform
facility opening costs. In order to obtain this result, our paper makes three
main technical contributions. First, we show a new lower bound for metric
facility location, extending the lower bound of Bădoiu et al. (ICALP 2005)
that applies only to the special case of uniform facility opening costs. Next,
we demonstrate a reduction of the distributed metric facility location problem
to the problem of computing an O(1)-ruling set of an appropriate spanning
subgraph. Finally, we present a sub-logarithmic-round (in expectation)
algorithm for computing a 2-ruling set in a spanning subgraph of a clique. Our
algorithm accomplishes this by using a combination of randomized and
deterministic sparsification.
Comment: 15 pages, 2 figures. This is the full version of a paper that
appeared in ICALP 201
On the Analysis of a Label Propagation Algorithm for Community Detection
This paper initiates formal analysis of a simple, distributed algorithm for
community detection on networks. We analyze an algorithm that we call
\textsc{Max-LPA}, both in terms of its convergence time and in terms of the
"quality" of the communities detected. \textsc{Max-LPA} is an instance of a
class of community detection algorithms called \textit{label propagation}
algorithms. As far as we know, most analysis of label propagation algorithms
thus far has been empirical in nature and in this paper we seek a theoretical
understanding of label propagation algorithms. In our main result, we define a
clustered version of Erdős-Rényi random graphs with clusters, where
the probability $p$ of an edge connecting nodes within a cluster is
higher than $p'$, the probability of an edge connecting nodes in distinct
clusters. We show that even with fairly general restrictions on $p$ and
$p'$ (stated in terms of $n$, the number of nodes), \textsc{Max-LPA} detects the
clusters in just two rounds. Based on this and on empirical
results, we conjecture that \textsc{Max-LPA} can correctly and quickly identify
communities on clustered Erdős-Rényi graphs even when the clusters are much sparser.
Comment: 17 pages. Submitted to ICDCN 201
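
The abstract does not spell out the update rule, so the following Python sketch rests on an assumption: every node starts with its own identifier as its label and, in each synchronous round, adopts the most frequent label in its closed neighbourhood, breaking ties in favour of the larger label. The function names `max_lpa_round` and `max_lpa` are hypothetical. Treat this as one plausible reading of a max-tie-breaking label propagation scheme, not as the authors' exact algorithm.

```python
from collections import Counter

def max_lpa_round(adj, labels):
    """One synchronous round: adopt the most frequent label in the closed
    neighbourhood, ties broken towards the larger label (assumed rule)."""
    new_labels = {}
    for v in adj:
        counts = Counter(labels[u] for u in adj[v])
        counts[labels[v]] += 1  # the closed neighbourhood includes v itself
        best = max(counts.values())
        new_labels[v] = max(l for l, c in counts.items() if c == best)
    return new_labels

def max_lpa(adj, rounds):
    """Run the assumed update rule for a fixed number of rounds."""
    labels = {v: v for v in adj}  # every node starts with a unique label
    for _ in range(rounds):
        labels = max_lpa_round(adj, labels)
    return labels

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2 - 3.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
```

On this toy graph, two rounds suffice to give each triangle a single shared label. This is only a hand-picked example and says nothing about the random-graph guarantees analysed in the paper.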
Using Read-k Inequalities to Analyze a Distributed MIS Algorithm
Until recently, the fastest distributed MIS algorithm, even for simple
graphs such as unoriented trees, was the simple randomized algorithm
discovered in the 1980s. This algorithm (commonly called Luby's algorithm) computes
an MIS in $O(\log n)$ rounds (with high probability). This situation changed
when Lenzen and Wattenhofer (PODC 2011) presented a randomized $O(\sqrt{\log n} \cdot \log\log n)$-round MIS algorithm for unoriented trees. This algorithm
was improved by Barenboim et al. (FOCS 2012), resulting in an $O(\sqrt{\log n \cdot \log\log n})$-round MIS algorithm.
The analyses of these tree MIS algorithms depend on "near independence" of
probabilistic events, a feature of the tree structure of the network. In their
paper, Lenzen and Wattenhofer hope that their algorithm and analysis could be
extended to graphs with bounded arboricity. We show how to do this. By using a
new tail inequality for read-k families of random variables due to Gavinsky et
al. (Random Struct Algorithms, 2015), we show how to deal with dependencies
induced by the recent tree MIS algorithms when they are executed on bounded
arboricity graphs. Specifically, we analyze a version of the tree MIS algorithm
of Barenboim et al. and show that it runs in $O(\mathrm{poly}(\alpha) \cdot
\sqrt{\log n \cdot \log\log n})$ rounds in the $\mathcal{CONGEST}$ model for
graphs with arboricity $\alpha$.
While the main thrust of this paper is the new probabilistic analysis via
read-$k$ inequalities, for small values of $\alpha$, this algorithm is faster
than the bounded arboricity MIS algorithm of Barenboim et al. We also note that
recently (SODA 2016), Ghaffari presented a novel MIS algorithm for general
graphs that runs in $O(\log \Delta) + 2^{O(\sqrt{\log\log n})}$ rounds; a
corollary of this algorithm is an $O(\log \alpha + \sqrt{\log n})$-round MIS
algorithm on arboricity-$\alpha$ graphs.
Comment: To appear in PODC 2016 as a brief announcement.
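
For readers unfamiliar with the baseline, the sketch below simulates the round structure of the random-priority variant of Luby's algorithm on a single machine. It is an illustrative assumption, not the tree or bounded-arboricity algorithm analysed in the paper. The function name `luby_mis` and the adjacency-dictionary input are hypothetical.

```python
import random

def luby_mis(adj, seed=0):
    """Sequentially simulate the random-priority variant of Luby's MIS algorithm.

    adj maps each vertex to a list of neighbours. Returns (mis, rounds).
    """
    rng = random.Random(seed)
    live = set(adj)
    mis = set()
    rounds = 0
    while live:
        rounds += 1
        # Each live vertex draws a random priority (vertex id breaks ties).
        prio = {v: (rng.random(), v) for v in live}
        # A vertex joins the MIS if it beats every live neighbour.
        winners = {v for v in live
                   if all(prio[v] < prio[u] for u in adj[v] if u in live)}
        mis |= winners
        # Winners and their neighbours leave the graph.
        removed = set(winners)
        for v in winners:
            removed.update(adj[v])
        live -= removed
    return mis, rounds

# A cycle on 12 vertices.
cycle = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
mis, rounds = luby_mis(cycle, seed=1)
```

The vertex with the globally smallest priority always wins its round, so the loop terminates. The returned round count is what the $O(\log n)$ bound refers to in the distributed setting.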
Super-Fast MST Algorithms in the Congested Clique Using o(m) Messages
In a sequence of recent results (PODC 2015 and PODC 2016), the running time of the fastest algorithm for the minimum spanning tree (MST) problem in the Congested Clique model was first improved to O(log(log(log(n)))) from O(log(log(n))) (Hegeman et al., PODC 2015) and then to O(log^*(n)) (Ghaffari and Parter, PODC 2016). All of these algorithms use Theta(n^2) messages independent of the number of edges in the input graph.
This paper positively answers a question raised in Hegeman et al., and presents the first "super-fast" MST algorithm with o(m) message complexity for input graphs with m edges. Specifically, we present an algorithm running in O(log^*(n)) rounds, with message complexity ~O(sqrt{m * n}) and then build on this algorithm to derive a family of algorithms, containing for any epsilon, 0 < epsilon <= 1, an algorithm running in O(log^*(n)/epsilon) rounds, using ~O(n^{1 + epsilon}/epsilon) messages. Setting epsilon = log(log(n))/log(n) leads to the first sub-logarithmic round Congested Clique MST algorithm that uses only ~O(n) messages.
Our primary tools in achieving these results are
(i) a component-wise bound on the number of candidates for MST edges, extending the sampling lemma of Karger, Klein, and Tarjan (JACM 1995), and
(ii) Theta(log(n))-wise-independent linear graph sketches (Cormode and Firmani, Dist. Par. Databases, 2014) for generating MST candidate edges.
Analysis of the Worst Case Space Complexity of a PR Quadtree
We establish an upper bound on the number of nodes that a resolution-r PR quadtree containing n points has in the worst case. This captures the fact that as n tends towards 4^r, the number of nodes in a PR quadtree quickly approaches O(n). This is a more precise estimate of the worst-case space requirement of a PR quadtree than has been attempted before.
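
The following Python sketch counts the nodes of a PR quadtree so that the effect of point placement and resolution can be explored experimentally. It does not reproduce the paper's bound. It assumes distinct integer points on a 2^r x 2^r grid, a leaf capacity of one point, and a node count that includes empty leaves. The function name `pr_quadtree_nodes` is hypothetical.

```python
def pr_quadtree_nodes(points, r, x0=0, y0=0):
    """Count all nodes (internal nodes plus leaves, empty leaves included) of the
    PR quadtree over the square [x0, x0 + 2^r) x [y0, y0 + 2^r).

    Assumes distinct integer points; a leaf stores at most one point.
    """
    if len(points) <= 1 or r == 0:
        return 1  # a leaf: empty, one point, or a single grid cell
    half = 1 << (r - 1)
    total = 1  # this internal node
    for dx in (0, 1):
        for dy in (0, 1):
            cx, cy = x0 + dx * half, y0 + dy * half
            sub = [(x, y) for (x, y) in points
                   if cx <= x < cx + half and cy <= y < cy + half]
            total += pr_quadtree_nodes(sub, r - 1, cx, cy)
    return total

full_grid = [(x, y) for x in range(4) for y in range(4)]  # n = 4^2 points, r = 2
```

With r = 3, two adjacent points such as (0, 0) and (1, 0) force splits down to the finest level and yield 13 nodes, whereas the far-apart pair (0, 0) and (7, 7) needs only 5. A completely full grid with n = 4^r points has (4^{r+1} - 1)/3 nodes, roughly 4n/3, which is consistent with the O(n) behaviour described above.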
Sample-And-Gather: Fast Ruling Set Algorithms in the Low-Memory MPC Model
Motivated by recent progress on symmetry breaking problems such as maximal independent set (MIS) and maximal matching in the low-memory Massively Parallel Computation (MPC) model (e.g., Behnezhad et al. PODC 2019; Ghaffari-Uitto SODA 2019), we investigate the complexity of ruling set problems in this model. The MPC model has become very popular as a model for large-scale distributed computing and it comes with the constraint that the memory-per-machine is strongly sublinear in the input size. For graph problems, extremely fast MPC algorithms have been designed assuming Ω̃(n) memory-per-machine, where n is the number of nodes in the graph (e.g., the O(log log n) MIS algorithm of Ghaffari et al., PODC 2018). However, it has proven much more difficult to design fast MPC algorithms for graph problems in the low-memory MPC model, where the memory-per-machine is restricted to being strongly sublinear in the number of nodes, i.e., O(n^ε) for constant 0 < ε < 1.
In this paper, we present an algorithm for the 2-ruling set problem, running in Õ(log^{1/6} Δ) rounds whp, in the low-memory MPC model. Here Δ is the maximum degree of the graph. We then extend this result to β-ruling sets for any integer β > 1. Specifically, we show that a β-ruling set can be computed in the low-memory MPC model with O(n^ε) memory-per-machine in Õ(β · log^{1/(2^{β+1}-2)} Δ) rounds, whp. From this it immediately follows that a β-ruling set for β = Ω(log log log Δ) can be computed in just O(β log log n) rounds whp. The above results assume a total memory of Õ(m + n^{1+ε}). We also present algorithms for β-ruling sets in the low-memory MPC model assuming that the total memory over all machines is restricted to Õ(m). For β > 1, these algorithms are all substantially faster than the Ghaffari-Uitto Õ(√(log Δ))-round MIS algorithm in the low-memory MPC model.
All our results follow from a Sample-and-Gather Simulation Theorem that shows how random-sampling-based Congest algorithms can be efficiently simulated in the low-memory MPC model. We expect this simulation theorem to be of independent interest beyond the ruling set algorithms derived here.
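
As a quick consistency check on the round bounds quoted above, write β for the ruling-set parameter (the symbol is an assumption here). The exponent 1/(2^{β+1}-2) in the general bound then specializes to the 1/6 of the 2-ruling set result and shrinks rapidly as β grows:

```latex
\[
\beta = 2:\ \frac{1}{2^{\beta+1}-2} = \frac{1}{2^{3}-2} = \frac{1}{6},
\qquad
\beta = 3:\ \frac{1}{2^{4}-2} = \frac{1}{14},
\qquad
\beta = 4:\ \frac{1}{2^{5}-2} = \frac{1}{30}.
\]
```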
Large-Scale Distributed Algorithms for Facility Location with Outliers
This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are:
- Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
- Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both facility location problems with outliers. As shown in our previous work (Berns et al., ICALP 2012; Bandyapadhyay et al., ICDCN 2018), Mettu-Plaxton-style algorithms lend themselves to efficient implementation in distributed and large-scale models of computation.
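
For orientation, here is a minimal sequential sketch of the classical Mettu-Plaxton greedy for plain metric facility location. It is not the distributed or outlier-aware algorithms of the paper. It assumes every point is both a client and a candidate facility and that the metric is given as a full distance matrix. The names `mp_radius` and `mettu_plaxton` are hypothetical.

```python
def mp_radius(f_i, dists):
    """Solve sum_j max(0, r - d_j) = f_i for r, where dists holds the
    distances from candidate facility i to every client."""
    d = sorted(dists)
    prefix = 0.0
    for k in range(1, len(d) + 1):
        prefix += d[k - 1]
        r = (f_i + prefix) / k  # radius if exactly the k closest clients pay
        # Accept once the ball does not reach past the next-closest client.
        if k == len(d) or r <= d[k]:
            return r

def mettu_plaxton(cost, dist):
    """Open facilities greedily in order of increasing radius, skipping any
    facility that has an already-open facility within twice its radius."""
    n = len(cost)
    radius = [mp_radius(cost[i], dist[i]) for i in range(n)]
    opened = []
    for i in sorted(range(n), key=lambda i: (radius[i], i)):
        if all(dist[i][j] > 2 * radius[i] for j in opened):
            opened.append(i)
    return opened, radius

# Four points on a line forming two tight pairs; opening cost 2 everywhere.
pos = [0, 1, 100, 101]
dist = [[abs(a - b) for b in pos] for a in pos]
opened, radius = mettu_plaxton([2, 2, 2, 2], dist)
```

Each facility's decision depends only on its own radius and on nearby facilities with smaller radii. This locality is the property that makes this style of algorithm attractive in distributed settings.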
- …